An Architecture for Scientific Document Retrieval: Using Textual and Math Entailment Modules

Authors

  • Partha Pakray
  • Petr Sojka
Abstract

We present an architecture for scientific document retrieval. An existing system for textual and math-aware retrieval, Math Indexer and Searcher (MIaS), is designed to be extended by modules for textual and math-aware entailment. The goal is to increase the quality of retrieval (precision and recall) by handling natural language variations that express the same meaning in texts and/or formulae. Entailment modules are designed to use several ordered layers of processing on the lexical, syntactic, and semantic levels, using natural language processing tools adapted for handling tree structures such as mathematical formulae. If these tools cannot decide on the entailment, generic knowledge databases are used, deploying distributional semantics methods and tools. It is shown that the sole use of distributional semantics for semantic textual entailment decisions at the sentence level is surprisingly good. Finally, plans for further research on deploying the results in digital mathematical libraries are outlined.
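The layered decision process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the MIaS API: the names `lexical_layer`, `cosine_fallback`, and `decide_entailment`, the token-subset rule, and the 0.5 threshold are all hypothetical. In particular, a bag-of-words cosine similarity stands in for the distributional semantics methods, which the abstract does not specify.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Optional, Sequence

# A layer returns True/False when it can decide, or None to defer to the next layer.
Layer = Callable[[str, str], Optional[bool]]


def tokens(sentence: str) -> list:
    """Lowercase whitespace tokenizer (a deliberately naive assumption)."""
    return sentence.lower().split()


def lexical_layer(text: str, hypothesis: str) -> Optional[bool]:
    """Hypothetical lexical layer: decide 'entails' only when every
    hypothesis token occurs in the text; otherwise stay undecided."""
    text_tokens = set(tokens(text))
    hyp_tokens = set(tokens(hypothesis))
    return True if hyp_tokens and hyp_tokens <= text_tokens else None


def cosine_fallback(text: str, hypothesis: str, threshold: float = 0.5) -> bool:
    """Stand-in for a distributional semantics fallback: bag-of-words
    cosine similarity compared against an assumed threshold."""
    a, b = Counter(tokens(text)), Counter(tokens(hypothesis))
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return norm > 0 and dot / norm >= threshold


def decide_entailment(
    text: str,
    hypothesis: str,
    layers: Sequence[Layer],
    fallback: Callable[[str, str], bool] = cosine_fallback,
) -> bool:
    """Try the ordered layers first; use the fallback only if none decides."""
    for layer in layers:
        verdict = layer(text, hypothesis)
        if verdict is not None:
            return verdict
    return fallback(text, hypothesis)
```

Syntactic and semantic layers, or layers that compare formula trees, would plug into the same `layers` sequence; returning `None` is what lets a layer pass the decision on to the next one.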


Similar resources

Semantic Methods for Textual Entailment

The problem of recognizing textual entailment (RTE) has been recently addressed using syntactic and lexical models with some success. Here, a new approach is taken to apply world knowledge in much the same way as humans, but captured in large semantic graphs such as WordNet. We show that semantic graphs made of synsets and selected relationships between them enable fairly simple methods that pr...


A Semantic Method for Textual Entailment

The problem of recognizing textual entailment (RTE) has been recently addressed using syntactic and lexical models with some success. Here, we further explore this problem, this time using the world knowledge captured in large semantic graphs such as WordNet. We show that semantic graphs made of synsets and selected relationships between them enable fairly simple methods that provide very compe...


Recognizing Textual Entailment via Multi-task Knowledge Assisted LSTM

Recognizing Textual Entailment (RTE) plays an important role in NLP applications like question answering, information retrieval, etc. Most previous works either use classifiers to employ elaborately designed features and lexical similarity or bring distant supervision and reasoning technique into RTE task. However, these approaches are hard to generalize due to the complexity of feature enginee...


A Framework for Entailed Relation Recognition

We define the problem of recognizing entailed relations – given an open set of relations, find all occurrences of the relations of interest in a given document set – and pose it as a challenge to scalable information extraction and retrieval. Existing approaches to relation recognition do not address well problems with an open set of relations and a need for high recall: supervised methods are ...


Multi-document Summarisation and the PASCAL Textual Entailment Challenge

A fundamental problem for systems that require natural language understanding capabilities is the identification of instances of semantic equivalence and paraphrase in text. The PASCAL Recognising Textual Entailment (RTE) challenge is a recently proposed research initiative that addressed this problem by providing an evaluation framework for the development of generic “semantic engines” that ca...



Publication date: 2014